This paper is a summary overview of a study
conducted at the NASA Marshall Space Flight
Center (MSFC) during the initial phases of the
Space Launch Initiative (SLI) program to evaluate a
large number of technical problems associated with
the design, development, test, evaluation and
operation of several major liquid propellant rocket
engine systems (i.e., SSME, Fastrac, J-2, F-1). The
result of this study was the identification of the
“Fundamental Root Causes” that enabled the
technical problems to manifest, and of practices that
can be implemented to prevent them from recurring
in future engine development efforts. This paper
discusses the Fundamental Root Causes, cites some
examples of how the technical problems arose from
them, and provides a discussion of how they can be
mitigated or avoided.


Introduction 

The NASA SLI program was initiated under NRA8-30 
to begin development of a space launch system that 
would be significantly safer and more economical to 
operate than current launch systems. SLI was identified 
as part of the Integrated Space Transportation Plan 
(ISTP) and followed on the NRA8-27 study to define 
an optimal roadmap that would produce a 2nd-Generation
Reusable Launch Vehicle (2GRLV). The
objective of the NRA8-27 study was to identify risk
reduction areas that were applicable to several 2GRLV
architectures by performing cycle analyses and trade
studies on applicable propulsion systems. Risk
reduction activities were then identified to mature the
technologies and cycles to production status. Other
elements of the ISTP identified at that time included
safety upgrades for NASA’s 1st-generation RLV, the
space shuttle, and the development of technologies for
3rd- and 4th-generation transportation systems.
The 2GRLV program was to build on NASA’s then-current
programs (e.g., X-33, X-34 and X-37), testing
new materials, structures, propulsion, computers and 
other technologies needed to meet the program’s goal 
of significantly increasing safety to a 1 in 10,000 
chance of loss of life and reducing payload launch costs 
from $10,000 per pound today to $1,000 per pound. 

The scope of NRA8-30 covered more than just the 
propulsion facet of space transportation. The ten 
technology areas (TAs) worked on all elements of the 
next manned space launch infrastructure. In addition, 
NRA8-30 was separated into multiple cycles and 


American Institute of Aeronautics & Astronautics 


phases to permit management flexibility. Cycle-1 
would focus on initial prototype development and risk 
reduction, with Cycle-2 culminating in the 
demonstration by test of the prototype engine. Phase-2 
of the SLI program would build on the foundation laid 
by the prototype engine project by the design, 
development, test, and deployment of the human-rated 
full-scale development (FSD) flight engine. 

At the beginning of the SLI program, it became 
apparent that NASA was embarking on a program to 
fully develop a selection of “clean sheet” rocket engines 
to power the next generation of reusable launch 
vehicles. It also became apparent that the prerequisite 
experience for development of the complex rocket 
engine systems had significantly thinned since NASA 
had last been involved in a clean-sheet rocket engine 
program, namely, the Space Shuttle Main Engine 
(SSME) development program conducted over a quarter 
century previously. Even then, the SSME program was 
able to utilize the relatively fresh experience in rocket 
engine development resulting from the Apollo program. 
By comparison, the body of knowledge available for 
application to the SLI engines was significantly more 
scarce, effectively either buried in a mountain of 
historical documents or residing in a diminishing 
number of technical consultants that had actual 
hardware development experience (respectfully known 
as “greybeards”). 

Looking ahead at the aggressive schedule projected by 
the SLI program, it was seen as necessary to try and 
anticipate some of the obstacles that could be 
encountered in the development of a prototypical 
engine system, and the means by which to avoid them. 
Previous rocket engine development programs had 
relied on the “test-fail-fix” philosophy of using 
hardware testing to wring out problems at the expense 
of destroyed test articles and abused test facilities. The 
expense involved in using this development philosophy 
was prohibitive in view of the more fiscally 
conservative environment and the fact that there were 
several concurrent engine development programs 
gearing up rather than just one. The problems resolved 
in the development of the SSME had been exhaustively 
documented, as had similar impediments
encountered in other rocket engine development
programs (e.g., F-1, J-2, H-1, MC-1). However,
the technical issues initially identified appeared highly 
specific to design elements of the particular engine 
system, which could be very difficult to effectively 
apply to a clean-sheet engine design. The realization 
developed that what was really needed was to look one 
level higher and try to identify the “fundamental root 
cause” that enabled the technical problem(s) to manifest 
in the first place. 


The REIMR Study 

A study was initiated at MSFC to begin development of 
a risk mitigation tool to assist in the development of 
liquid propellant rocket engines, as well as the process 
for the continuing enhancement of the tool and its 
effective use. The tool, known as Rocket Engine Issue 
Mitigation Resource (REIMR), can also be applied in a 
broader sense to almost any complex system 
development effort through the understanding and 
application of the Fundamental Root Cause (FRC) 
philosophy that the study identified. 

The REIMR study had several primary and secondary 
objectives: 

Primary 

> Initially, the study was to document and study 
possible technical issues that could be 
encountered in the development of a clean-sheet 
rocket engine. As more results developed, the 
objectives of the study were adjusted to include 
identification of the significant technical and 
fundamental root causes for the problems that 
have occurred during rocket engine development, 
and apply this knowledge to improve future liquid 
rocket engine programs with emphasis on 
reusable manned systems. 

> Establish process to allow personnel to contribute 
to and benefit from past applicable engine 
experience in both broad and narrow focus. This 
process was oriented toward reducing technical 
risk of future programs. 

> The goal of this effort was not so much to 
identify the technical issues that can occur, but 
more to illuminate the fundamental root causes 
that allowed the technical issue to develop. 

Secondary 

> Expand the experience base of personnel that will 
be supporting the 2GRLV program in terms of 
reusable liquid propellant rocket engines. 

> Infuse an understanding of the sensitive trades 
that go into the engine development process by 
using examples derived from SSME. 

The initial basis for the study was Bob Ryan’s “A
History of Aerospace Problems, Their Solutions, Their
Lessons,” which contained a comprehensive selection of
issues encountered during the development of a number
of propulsion systems, especially the SSME [1]. These
issues provided the initial set of subjects for the
REIMR study, in which additional information
on each issue was researched to develop a more
in-depth understanding of the problem, how it
developed, and how it was solved. Additional issues
and supporting information were also derived from other
“Lessons Learned” activities, mishap/failure reports,
personal interviews, and other archived information.
The primary sources used to support the REIMR study 
are shown in Figure 1. 

As more issues were identified and studied, the process 
for understanding and utilizing them collectively was 
developed, which is shown in Figure 2. This process 
started out with reviewing existing engine development 
summaries and “Lessons Learned” documentation to 
identify the specific issue to be researched, followed by 
“data mining” from validated sources/databases and 
interviews with personnel with detailed knowledge of 
the problem. This effort was initially focused on
documenting all the technical causes of the engine
issues and looking for any similarity in the candidate
engines being developed under the 2GRLV program.
However, it became apparent that trying to match the 
operational or design event that had caused the issue to 
the emerging specifics of any of the 2GRLV engines 
was a “hit-or-miss” affair, being very difficult to 
accurately match the “Lesson Learned” to the potential 
“Lesson-to-be-Learned.” Identification of a score of 
more generic symptoms, referred to as “Fundamental 
Root Causes,” permitted the study group to review the 
2GRLV development engines at a system level. The 
evolved REIMR process took the standard “Lessons 
Learned” exercise one step further. After an individual 
or subgroup collected required relevant material on a 
particular issue, it was reviewed in consensus with the
rest of the study group to identify the FRC(s) that
precipitated the issue. Identification of the FRC
and the issue itself was also recommended. The flow of
cause-and-effect for a specific issue and how the
FRC integrates into the flow is shown in Figure 3.

By the time information was due to be released on the
2GRLV engines, the REIMR database was expected to
have achieved a sufficient level of maturity to permit a
preliminary checklist to be extracted to compare the
engines at the system level against the FRCs and at the
component level against specific technical issues.

Figure 1: Primary Sources Supporting the REIMR Study

Figure 2: The REIMR Development Process

Any relevant issues identified by the REIMR checklist
would be tracked for potential mitigation. It was
expected that the initial release of information on the
2GRLV engines would probably not be at a high level of
detail. For this reason, REIMR was used as a tool to
help guide the engine DDT&E process in Cycle-1 of the
2GRLV program.
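
As an illustration only, the checklist extraction described above can be
sketched in a few lines of Python. The record layout (Issue), its field
names, and the function build_checklist are hypothetical and are not taken
from the REIMR database itself; the three sample records paraphrase examples
discussed elsewhere in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    """One historical issue record (all field names are hypothetical)."""
    title: str
    engine: str     # engine on which the issue occurred
    component: str  # component involved
    frc: str        # fundamental root cause assigned by the study group


def build_checklist(issues, components):
    """Return (system_level, component_level) checklist items.

    system_level:    sorted unique FRCs found across all issues
    component_level: issues whose component also exists in the new engine
    """
    system_level = sorted({i.frc for i in issues})
    wanted = {c.lower() for c in components}
    component_level = [i for i in issues if i.component.lower() in wanted]
    return system_level, component_level


ISSUES = [
    Issue("Turnaround duct sheet metal cracking", "SSME",
          "turbine turnaround duct",
          "Inadequate understanding of the engine environment"),
    Issue("Heat exchanger weld and FOD sensitivity", "SSME",
          "heat exchanger",
          "Inadequate systems engineering and integration"),
    Issue("Recurring valve problems", "MC-1",
          "engine valves",
          "Inadequate resources"),
]

# Components of a hypothetical new engine design under review.
system_level, component_level = build_checklist(
    ISSUES, ["Heat exchanger", "Engine valves"]
)
```

The system-level list asks "is this FRC present in the new program?", while
the component-level list only surfaces past issues for hardware the new
engine actually has, which mirrors the two levels of comparison described
above.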

Effort was made to keep the number of FRCs small. A
large number of root causes were initially identified, but
many were actually subsets or reflections of the FRCs
that REIMR utilized. Many of the FRCs identified in
the REIMR study come as no great surprise to an
experienced systems engineer and can be largely seen
as common sense. The reasons why these violations
of common sense occur are beyond the scope of this
paper or the study.



Figure 3: Issue Cause & Effect 


Fundamental Root Causes 

Identification of the FRCs was not an epiphany that 
suddenly happened, but was rather a progressive 
understanding of some of the higher-order predecessors 
that can spawn a particular problem during the life 
cycle of a rocket engine, ranging from conceptual 
development to flight. As more and more issues were 
collected and studied, one or more FRCs could often be 
identified that enabled the problem to manifest. 

The FRCs currently used in REIMR, as well as 
descriptions and examples are as follows: 

Inadequate understanding of the engine 
environment 

This fundamental root cause includes adequacy of 
analysis tools & techniques used to predict the 
physical environment in the engine, the ability of 
the instrumentation system to measure the 
environment, and all other physical or conceptual 
reasons the real engine environment is different 
than the predicted value used during the design 
process. 

The SSME hot gas system provided several 
examples of this FRC enabling technical problems, 
specifically, recurring incidences of sheet metal 
cracking in the turbine turnaround ducts. The lack 
of understanding of the engine environment did not 
permit the sheet metal to be designed with 
sufficient coolant flow, which precipitated the 
initiation and propagation of the cracks. The 
corrective action required for this problem was to 
inspect and track the propagation of the cracks, 
then perform a weld repair on any crack that got 
too long. The consequence of this problem was 
expensive and time-consuming inspection, 
maintenance and repair operations. Resolution of 
this issue was accomplished as a result of the 
Technology TestBed (TTB) program conducted at
MSFC in the mid-1990s, where a highly
instrumented SSME was subjected to a test
program that permitted a more penetrating
characterization of the engine internal
environment. As a result, when Pratt & Whitney
designed the Advanced Turbopump Development
(ATD) turbomachinery for SSME, this expanded
data allowed effective elimination of the
sheet-metal cracking problem.

Inadequate systems engineering and integration

This fundamental root cause captures problems
resulting from not adequately addressing all
aspects of the systems engineering trade studies,
including reusability, reliability, maintainability,
manufacturability, and performance.

The design of the SSME heat exchanger has been a 
source of concern throughout the SSME program 
history, in that any leakage of the GOX from the 
heat exchanger into the fuel-rich hot gas system is 
a Crit-1 failure mode that can cause a loss of 
vehicle or crew. The original heat exchanger 
design utilized a dual-tube configuration that had 
several critical welds that were difficult to 
accomplish and inspect. The exposed thin-walled 
tubing extending into the hot gas flowpath also 
made it susceptible to damage from FOD impacts. 
The high heat transfer requirement and chosen 
method of tank pressurization drove the design, but 
the design trades did not take into account the 
manufacturing difficulties and FOD intolerance. 
One mitigation measure implemented was to 
change the design to a single-tube configuration 
that had fewer welds. 

Inadequate resources 

This fundamental root cause captures problems 
resulting from inadequate budget, schedule, 
personnel, equipment, or facilities being made 
available when needed. 

The MC-1 engine development program had many 
instances of insufficient resources causing 
recurring problems in development hardware, 
especially the engine valves. One of the goals of 
the MC-1 program was to demonstrate the ability 
to develop a flight-certified engine for use on the 
X-34 vehicle at a fraction of the historical recurring 
and non-recurring costs. In this respect, the 
program was successful, but the consequences 
included a dire shortage of development hardware 
and temperamental engine valves. As a result, 
budget and schedule were affected by repeated 
trouble-shooting of valve problems for which there 
were few replacements available. A shortage of 
development hardware also required constant 
cannibalization of off-stand engines to support the 
ongoing development test program, causing lost 
schedule and hardware tracking headaches. 

The initial SSME development program rushed 
into system testing, sacrificing the potential 
benefits of component- and/or subsystem-level 
testing in order to shorten the development 
schedule and cost. This made any test failures 
more costly as the failure occurred at the system 
level, rather than at the component level. 


Overestimation of technology base

This fundamental root cause captures issues where
overly optimistic design goals established
unrealistic design requirements as a result of
an overestimation of the state of the art of
technology at that time. This also addresses an
inadequate understanding of the technical risk or
current technology readiness level (TRL).

Examples of this fundamental root cause are
numerous, both at a programmatic level (e.g.,
NASP, X-33) and further down at the analytical or
component design level. Other examples
include overestimation of the technical maturity of
the materials, manufacturing processes or avionics
applied to an engine development program, such as
in projects involved in the development of an
integrated engine health management system
(IEHMS). Experience has repeatedly shown that
the complexity involved in developing an effective
IEHMS is hard to overestimate.

The complexity of SSME data reduction required
numerical methods that were not within the
state-of-the-art computational capabilities at the time of
initial SSME development, and that have only recently
been identified as being feasible for use in an
engine health monitoring system.

Inadequate quality processes 

This fundamental root cause captures problems
resulting from inadequate quality processes, or
conversely problems that would not have
occurred if quality processes had been followed or if
appropriate quality processes had been in place. This
FRC includes ‘mistakes’, or human-factor events, if
the event could have been precluded with a
“quality” or management process in place.

Several engine test failures were caused in the 
SSME by quality process failures allowing the 
introduction of FOD contamination (e.g., LOX 
tape) during assembly or maintenance operations. 
Other process failures include utilization of 
incorrect weld wire, which caused a catastrophic 
SSME steerhorn failure at the assembly weld, or 
failure to install an actuator coupling during a 
valve change-out, causing a premature cut-off 
during a test. 

Immature mission/vehicle design requirements
imposed unnecessary engine requirements.

This fundamental root cause captures problems
caused by the flow down of immature or unrealistic
mission or vehicle requirements. While this is
similar to the inadequate SE&I trades FRC, it is
differentiated by involving higher-level requirements
over which the engine program had no control.

An example of an immature requirement was for 
the SSME to have independent thrust and mixture 
ratio control. This was a requirement levied by the 
vehicle to permit thrust control to achieve the 
desired flight trajectory, and mixture ratio control 
to optimize ascent performance and minimize 
residual propellants at main engine cut-off 
(MECO). This forced the engine system design to 
utilize a dual preburner configuration, which 
significantly increased the complexity of the 
engine system and subsequently the number of 
concerns to solve. As it turned out, the 
requirement for mixture ratio control during flight 
was eventually eliminated from the vehicle, but too
late to be reflected in a simpler SSME design.

Another example of an immature engine
requirement was that of the high thrust-to-weight
ratio (T/W) levied on the SSME during its initial
development. This requirement is generally based
on the vehicle being able to carry as much
propellant or payload as possible by forcing the
vehicle systems to be as light as possible. This
forced engine weight to be at a premium, resulting
in development of high-pressure, high-performance,
low-weight components with a
corresponding high number of component life and
safety concerns. These concerns required extensive
inspections and maintenance between operations to
mitigate. The high T/W requirement levied by the
vehicle also turned out to be largely unnecessary,
as the first glide flights of the Shuttle identified a 
stability concern that was corrected by the 
installation of ballast in the vehicle boat-tail. As 
the SSME weight was increased over the years as a 
consequence of block upgrades to enhance 
reliability, the vehicle ballast was progressively 
removed. 

Inadequate understanding of assembly
environments and process variability.

This FRC captures problems resulting from not
adequately understanding or considering the
manufacturing and assembly environments and
process variability. This includes proper
concurrent engineering processes to design for
manufacturability. Failure to overcome this FRC
will result in a high reject rate of fabricated parts
or elevated inspection and maintenance needs.

During SSME post-flight inspections, cracking was 
identified on a turbopump shaft bearing inner race. 
An investigation showed that the cracking had
initiated at a corrosion pit, and traces of chlorine
were detected on the part. Some changes in the
manufacturing process and drying procedures had 
been instituted in a new manufacturing facility that 
were different from those used in the original 
development pump room. The drying procedure to 
eliminate moisture prior to bearing installation did 
not work properly at the new facility and permitted 
the trapping of moisture between the race and 
shaft. Future mitigation would be to ensure that
the component design and assembly process allows
for the removal of moisture from the assembly
stack and eliminates the potential for trapping of
moisture.

Inadequate understanding of material properties. 

This FRC captures problems resulting from 
inaccurate or incomplete material performance 
information used during the design and analysis 
process. This includes proper consideration of 
allowable variations within specification. 

Identification and mitigation of the effects of 
hydrogen exposure embrittlement (HEE) to engine 
materials should always be taken into account. For 
example, the SSME experienced a catastrophic test
failure caused by failure of a 2nd-stage turbine
blade. The blade failure was caused by internal
crack growth of a pre-existing subsurface flaw 
embrittled by hydrogen exposure. The 
embrittlement was a result of hydrogen exposure 
through microshrinkage porosity or by diffusion as 
a result of long-term exposure. 

Inadequate design margins. 

This FRC captures problems resulting from design
requirements with optimistically low margins of
safety and is related to the “Overestimation of
technology base” FRC, but applied at a lower level.

An example of this FRC is the investigation and 
mitigation of high synchronous rotordynamic 
vibration in the SSME HPOTP caused by lack of 
margin in the bearing design to account for 
unknown hydrodynamic influences. The 
identification and resolution of this anomaly was 
conducted during component-level testing. This 
shows the importance of component-level testing 
under realistic conditions to work out design and 
operational problems early. 

Inadequate or loosely-worded requirements or
specifications

This FRC captures problems resulting from
requirements or specifications that fail to
adequately capture what is required from the
system or component. This can be a result of
wording the requirement or specification such that
there is too much “wiggle-room,” allowing
unacceptable materials or components to be used.

A good requirement provides a balance between 
ensuring that the system needs are achieved while 
leaving enough latitude to permit the designers to 
reach the optimal design solution. 

Pre-emptive mitigation for this gremlin is to
baseline the requirements as early and as
thoroughly as possible. Some change in the
requirements is permitted so long as it is 
understood that the larger the change, the more 
impact in budget and schedule it will cause. 
Further, any requirements changes after the engine 
Preliminary Design Review (PDR) should be kept 
to an absolute minimum. Make sure the 
requirements do not force the designers into using 
a specific design solution or unnecessarily 
constrain the design trade space. Immature 
requirements imposed early can have a lasting 
impact. 

High performance requirements (Isp, T/W, etc.)
drove design to be very sensitive to all design and
operations parameters

This fundamental root cause addresses the lack of 
margin or robustness in the engine system or 
component caused by the high performance 
requirements. 

For example, the high T/W requirement levied on 
the SSME during its initial development forced 
engine weight to be at a premium, causing 
development of high pressure turbomachinery with 
very high power densities. This sacrificed system
robustness and made the turbomachinery highly
sensitive to variations between engine
units.

In addition, the high performance requirements
(e.g., high Pc, dual preburner, high power density,
high energy propellants) made test data
reduction difficult due to the difficult measurement
environment and the complex, closed-loop nature
of the SSME cycle.

Many of the fundamental root causes are inter-related 
and often one will precipitate another. For example, 
when high performance requirements for T/W conflict 
with structural requirements for margin of safety, one 
will be given priority over the other unless the available 
materials can answer the needs of both. Then it 
becomes a question of whether the materials technology 
is mature enough to answer the needs of the engine, or 
if there are adequate resources available to develop it. 


The goal of the REIMR exercise was to identify which
FRC was the primary initiator that gave rise to the others.
Any additional FRC assigned to an issue was limited to
a single secondary FRC, if needed.
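
The primary/secondary convention can be made concrete with a small Python
sketch. This is a hedged illustration, not the REIMR implementation: the
class FrcAssignment, its fields, and the function tally_primary are
hypothetical names, and the primary/secondary split shown for the sample
issues is an assumption made for the example rather than the study group's
recorded assignment.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FrcAssignment:
    """FRCs assigned to one issue (hypothetical record layout)."""
    issue: str
    primary: str                     # FRC judged to be the initiator
    secondary: Optional[str] = None  # at most one additional FRC

    def __post_init__(self):
        # A secondary FRC that repeats the primary adds no information.
        if self.secondary is not None and self.secondary == self.primary:
            raise ValueError("secondary FRC must differ from the primary FRC")


def tally_primary(assignments):
    """Count how often each FRC was named as the primary initiator."""
    return Counter(a.primary for a in assignments)


# Illustrative assignments only; the split below is an assumption.
ASSIGNMENTS = [
    FrcAssignment(
        issue="SSME high thrust-to-weight requirement",
        primary="Immature mission/vehicle design requirements",
        secondary="High performance requirements",
    ),
    FrcAssignment(
        issue="SSME independent thrust and mixture ratio control",
        primary="Immature mission/vehicle design requirements",
    ),
    FrcAssignment(
        issue="SSME HPOTP synchronous vibration",
        primary="Inadequate design margins",
    ),
]

counts = tally_primary(ASSIGNMENTS)
```

Restricting each record to one primary and one optional secondary field keeps
the tally unambiguous: every issue contributes exactly once to the count of
initiating causes, even when several FRCs are inter-related.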

Application to Future Propulsion System 
Development 

Although the REIMR study was conducted to support 
the SLI / 2GRLV program, it can be easily extended to 
support any future propulsion system development 
program, including the ongoing Lunar/Mars exploration 
initiative. 

It is also important to note that while the technical 
issues collected in the REIMR database are primarily 
specific to liquid propellant rocket engines, the FRCs 
can be applied to almost any complex system. 

Conclusion 

It should be emphasized that the objective of this paper 
was not to provide a “Systems Engineering 101” or 
“Rocket Science for Dummies” tutorial, or to attack the 
SSME by parading out every problem it ever had. The 
REIMR study was useful in highlighting the top-level
triggers that generate issues during the life cycle of a
rocket engine, and in providing specific examples.

With regard to the SSME, it has the distinction (and 
liability) of being one of the most long-lived (and 
extensively documented) rocket engine systems ever 
used, accumulating over a million seconds of total 
hotfire time. It has an amazing track record of 
performance and demonstrated reliability, and most of 
the rocket propulsion engineers at NASA have gained 
valuable experience by supporting the SSME program. 
However, its passage into history has not been without 
a few potholes, and those have to be understood lest 
they be repeated. 

In retrospect, the REIMR study had a few Lessons 
Learned of its own, including: 

> Potential perception as being another “Lessons 
Learned” (i.e., “Lessons Learned, Documented, 
and then Forgotten”) activity. 

> “Oh my gawd, not another database!” 

> A majority of the technical issues identified were 
primarily specific to SSME, so extrapolation was 
required to apply to 2GRLV main engines except 
through application of the Fundamental Root 
Causes. 

> REIMR focuses more on what was done wrong and 
not enough on what was done right. 

> Time to fully develop REIMR was very limited 
after the SLI/2GRLV propulsion projects began.

Overall, most of the results from the REIMR study
come as no great surprise to an experienced systems
engineer and can be largely seen as “common sense.”
However, it has been shown that bringing all these bits
of expensive wisdom together under one cover was
useful in preparing the engineers tasked with supporting
the 2GRLV program by providing a better sensitivity to
what conditions to avoid or mitigate. Continued
application of the REIMR process and the Fundamental 
Root Causes can be useful in the development of future 
propulsion systems and other complex systems. 


[1] Ryan, R.S., “A History of Aerospace Problems, Their
Solutions, Their Lessons,” NASA-TP-3653, 1996.